17.1 DNA Sequencing

the unknown DNA. 9 One should also note inexpensive methods designed to detect

the presence of a mutation in a sequence; steady progress in automation is enabling

ever larger pieces of DNA to be tackled.

17.1.4

Expressed Sequence Tags

Expressed sequence tags (ESTs) are derived from the cDNA complementary to

mRNA. They consist of the sequence of typically 200–600 bases of a gene, sufficient

to uniquely identify the gene. The importance of ESTs is, however, tending

to diminish as sequencing methods become more powerful.

Expressed sequence tags are generated by isolating the mRNA from a particular

cell line or tissue and reverse-transcribing it into cDNA, which is then cloned into a

vector to make a “library”. 10 Some 400 bases from the ends of individual clones are

then sequenced.
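
The two steps just described (reverse transcription, then reading from the ends of a clone) can be illustrated with a toy sketch. It is only a string-level illustration under simplifying assumptions: the function names are hypothetical, the sequences are idealized single strands without errors, and the default read length of 400 is taken from the text.

```python
# Toy sketch: mRNA -> first-strand cDNA -> reads from the ends of a clone.
# All names are hypothetical and chosen for illustration only.

RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(mrna):
    """Return the first-strand cDNA complementary to an mRNA, written 5'->3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna))


def reverse_complement(dna):
    """Return the opposite DNA strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))


def end_reads(clone_insert, read_length=400):
    """Return one read from each end of a cloned insert.

    The first read starts at the 5' end of the given strand; the second
    starts at the 5' end of the opposite strand. Each has at most
    read_length bases.
    """
    forward = clone_insert[:read_length]
    reverse = reverse_complement(clone_insert)[:read_length]
    return forward, reverse


cdna = reverse_transcribe("AUGC")          # "GCAT"
tags = end_reads("AAAACCCCGGGG", read_length=4)  # ("AAAA", "CCCC")
```

For an insert shorter than twice the read length, the two end reads cover overlapping parts of the clone.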

If they overlap, ESTs can be used to reconstruct the whole sequence as in shotgun

sequencing, but their primary use is to facilitate the rapid identification of DNA.

For various reasons, not least low-fidelity transcription, the sequences are typically

considerably less reliable than those generated by conventional gene sequencing.
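
The reconstruction from overlapping ESTs mentioned above can be sketched as a suffix-prefix merge of two reads. This is a minimal illustration that assumes error-free reads on the same strand; because ESTs are comparatively unreliable, a practical assembler would have to tolerate mismatches. The function names and the `min_overlap` parameter are hypothetical.

```python
# Toy sketch: merge two overlapping reads by exact suffix/prefix matching.
# Assumes error-free reads on the same strand (a strong simplification).


def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` equal to a prefix of `right`.

    Returns 0 if no overlap of at least `min_overlap` bases exists.
    """
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def merge_pair(left, right, min_overlap=3):
    """Merge two reads into one contiguous sequence, or return None."""
    k = overlap_length(left, right, min_overlap)
    if k == 0:
        return None
    return left + right[k:]


# The reads share the four bases "GTAC".
contig = merge_pair("ATGGCGTAC", "GTACCTTAG")  # "ATGGCGTACCTTAG"
```

The `min_overlap` threshold guards against merging reads that share only a few bases by chance; the value 3 is chosen merely to suit the short toy sequences.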

17.1.5

Next Generation Sequencing

With next (or second) generation sequencing (NGS, also known as massively parallel

sequencing or deep sequencing), an entire human genome can be sequenced within

a few hours in the most favourable cases, compared with the ten years or so required

to produce the first finished sequence of the human genome using conventional Sanger

sequencing. 11 The principle of NGS is not, however, very different from that of the

Sanger method—essentially it is a parallelization of the latter.

In NGS, the DNA is randomly fragmented, either enzymatically or by sonication.

Synthetic double-stranded oligonucleotides of known sequences are attached to the

fragments (adapter ligation) with the help of DNA ligase. The adapters enable the

fragments to become bound to a planar array of complementary counterparts. The

collection of fragments is known as a “library”.
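
As a rough illustration of library preparation, the following sketch cuts a sequence at random positions and attaches known adapter sequences to both ends of every fragment. It is a string-level toy model: real fragments and adapters are double-stranded, the adapter sequences below are arbitrary placeholders rather than those of any actual platform, and the function names and the `mean_length` parameter are hypothetical.

```python
import random

# Arbitrary placeholder adapters (NOT real platform sequences).
ADAPTER_5 = "ACACGACG"
ADAPTER_3 = "AGATCGGA"


def fragment(dna, mean_length=8, rng=None):
    """Cut `dna` at random positions into non-empty pieces.

    The number of cuts is chosen so that the average piece length is
    roughly `mean_length`; joining the pieces in order restores `dna`.
    """
    rng = rng or random.Random()
    n_cuts = max(0, len(dna) // mean_length - 1)
    cuts = sorted(rng.sample(range(1, len(dna)), n_cuts))
    bounds = [0] + cuts + [len(dna)]
    return [dna[a:b] for a, b in zip(bounds, bounds[1:])]


def ligate_adapters(fragments):
    """Attach the known adapters to both ends of every fragment."""
    return [ADAPTER_5 + piece + ADAPTER_3 for piece in fragments]


genome = "ATGC" * 10
pieces = fragment(genome, mean_length=8, rng=random.Random(0))
library = ligate_adapters(pieces)
```

Because the adapter sequences are known, every member of the library can be captured by the same complementary oligonucleotides, whatever the unknown insert between the adapters may be.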

The library must then be “amplified” using the PCR (Sect. 17.1.2); that is, many

copies of each fragment are made in order to ensure sufficiently strong signals

from the subsequent sequencing. Reaction conditions are chosen to favour

the formation of clusters of identical strands.
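
The effect of amplification on copy number can be estimated with the usual idealized model of the PCR, in which each cycle multiplies the number of copies by (1 + efficiency), with efficiency = 1 corresponding to perfect doubling. The sketch below is an arithmetic illustration only; the function names are hypothetical, and real reactions fall short of perfect doubling and eventually plateau.

```python
# Idealized PCR arithmetic: copies grow by a factor (1 + efficiency) per cycle.


def copies_after_pcr(initial_copies, cycles, efficiency=1.0):
    """Expected number of copies after a given number of cycles."""
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_needed(target_copies, initial_copies=1, efficiency=1.0):
    """Smallest number of cycles that yields at least `target_copies`."""
    if efficiency <= 0 and target_copies > initial_copies:
        raise ValueError("no amplification is possible with efficiency <= 0")
    cycles = 0
    copies = float(initial_copies)
    while copies < target_copies:
        copies *= 1.0 + efficiency
        cycles += 1
    return cycles


copies = copies_after_pcr(1, 10)   # 1024.0 with perfect doubling
needed = cycles_needed(1000)       # 10, since 2**9 = 512 < 1000 <= 2**10
```

Under this model about ten cycles suffice to turn a single fragment into a cluster of roughly a thousand identical strands.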

9 See França et al. (2002) for a review, and Braslavsky et al. (2003) for a single-molecule technique.

10 In this context, “library” is used merely to denote “collection”.

11 The Human Genome Project was completed in 2003; NGS was introduced in 2005.